    Introduction to StarNEig -- A Task-based Library for Solving Nonsymmetric Eigenvalue Problems

    In this paper, we present the StarNEig library for solving dense non-symmetric (generalized) eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library support GPUs. The library is currently in an early beta state and only real arithmetic is supported. Support for complex data types is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer that should make it easy for a new user to transition to StarNEig. We demonstrate the performance of the library with a small set of computational experiments.
    Comment: 10 pages, 4 figures (10 when counting sub-figures), 2 tex-files. Submitted to PPAM 2019, 13th international conference on parallel processing and applied mathematics, September 8-11, 2019. Proceedings will be published after the conference by Springer in the LNCS series. The second author's first name is "Carl Christian" and his last name is "Kjelgaard Mikkelsen".

    How accurate does Newton have to be?

    We analyze the convergence of quasi-Newton methods in exact and finite precision arithmetic. In particular, we derive an upper bound for the stagnation level and we show that any sufficiently exact quasi-Newton method will converge quadratically until stagnation. In the absence of sufficient accuracy, we are likely to retain rapid linear convergence. We confirm our analysis by computing square roots and solving bond constraint equations in the context of molecular dynamics. We briefly discuss implications for parallel solvers.
    Comment: 12 pages, 2 figures, preprint accepted by PPAM 2022, expected to appear in LNCS vol. 13826 during 202
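The square-root computation mentioned in the abstract has a compact illustration. The following is a minimal sketch (illustrative, not the paper's code) of Newton's method for x^2 = a: the error roughly squares at each step until it stagnates near machine precision, which is the qualitative behavior the paper analyzes.

```python
import math

def newton_sqrt(a, x0, steps=8):
    """Newton iteration for x^2 = a: x_{k+1} = (x_k + a/x_k) / 2."""
    xs = [x0]
    for _ in range(steps):
        xs.append(0.5 * (xs[-1] + a / xs[-1]))
    return xs

iters = newton_sqrt(2.0, 1.0)
errors = [abs(x - math.sqrt(2.0)) for x in iters]
# Quadratic convergence: each error is roughly the square of the previous
# one, until the iteration stagnates near the unit roundoff.
```

With an inexact (quasi-Newton) update, the same iteration typically degrades from quadratic to fast linear convergence, which is the regime the paper quantifies.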

    The explicit Spike algorithm: Iterative solution of the reduced system

    The explicit Spike algorithm applies to narrow banded linear systems which are strictly diagonally dominant by rows. The parallel bottleneck is the solution of the so-called reduced system, which is block tridiagonal and strictly diagonally dominant by rows. The reduced system can be solved iteratively using the truncated reduced system matrix as a preconditioner. In this paper we derive a tight estimate for the quality of this preconditioner.
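For reference, strict diagonal dominance by rows means |a_ii| > Σ_{j≠i} |a_ij| for every row i. A small self-contained check (a hypothetical helper, not from the paper) applied to a tridiagonal example of the kind the SPIKE algorithm targets:

```python
def strictly_diag_dominant_rows(A):
    """Return True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    n = len(A)
    return all(
        abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

# Narrow banded (tridiagonal) and strictly diagonally dominant by rows:
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]

# Not diagonally dominant (off-diagonal entries dominate row 1):
B = [[1.0, 2.0],
     [3.0, 1.0]]
```

Diagonal dominance is what makes the spikes decay away from the interfaces, which is why truncating the reduced system yields a good preconditioner.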

    Numerical methods for large Lyapunov equations

    Balanced truncation is a standard technique for model reduction of linear time invariant dynamical systems. The most expensive step is the numerical solution of a pair of Lyapunov matrix equations. We consider the direct computation of the dominant invariant subspace of a symmetric positive semidefinite matrix, which is given implicitly as the solution of a Lyapunov matrix equation. We show how to apply subspace iteration with Ritz acceleration in this setting. An n by n Lyapunov matrix equation is equivalent to a standard linear system with n^2 unknowns. Theoretically, it is possible to apply any Krylov subspace method to this linear system, but this option has not really been explored because of the O(n^2) flop and storage requirements. In this dissertation we show that it is possible to reduce these requirements to O(n) for Lyapunov equations with a low rank inhomogeneous right-hand side. We show how to accomplish the reduction for a variety of methods, including GMRES, CG, BCG and CGNR. In each case the key observation is a special relationship between certain Krylov subspaces in [special characters omitted] and [special characters omitted]. It is theoretically possible to precondition a Lyapunov matrix equation which is written as a standard linear system. However, our investigation has revealed that the choice of preconditioners is extremely limited if we are to keep the storage and flop count at O(n). Above all, we have found that while it is certainly possible to reduce the resource requirements to O(n), the constants are too large to be competitive. The fundamental problem is that Krylov subspace methods do not take advantage of the low rank phenomenon for Lyapunov matrix equations.
    Currently, the most successful Lyapunov matrix equation solver is the low rank cyclic Smith method. Central to this method are the automatic selection of certain shift parameters and preconditioners and the solution of certain linear systems. This is extremely difficult to accomplish in general, but the problem simplifies considerably in the special case in which the defining matrices can be reordered as narrow banded matrices. Currently, it is the solution of these narrow banded linear systems which is the bottleneck in an efficient parallel implementation of the low rank cyclic Smith method.
    In the final part of this dissertation we consider the parallel solution of narrow banded linear systems. We give an error analysis of the truncated SPIKE algorithm, which applies to systems which are strictly diagonally dominant by rows. Above all, we establish tight bounds on the decay rate of the spikes and the truncation error. We explain why this analysis carries over only partially to the general case. Our analysis of the truncated SPIKE algorithm has immediate implications for the overlapping partition method (OPM). Finally, we consider the question of reducing the amount of interprocessor communication during the solve phase for a general narrow banded linear system. The final conclusion is that such a system is essentially block diagonal in a sense which can be made very precise.
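The equivalence between an n by n Lyapunov equation and a linear system with n^2 unknowns comes from the vec/Kronecker identity: with column-major vec, vec(AX + XA^T) = (I ⊗ A + A ⊗ I) vec(X). A minimal sketch (assuming NumPy is available; illustrative, not the dissertation's code) that forms the equivalent system for a small example and verifies the residual:

```python
import numpy as np

# AX + XA^T + BB^T = 0  becomes  (I kron A + A kron I) vec(X) = -vec(BB^T),
# an n^2-unknown standard linear system (column-major vec throughout).
n = 4
rng = np.random.default_rng(0)
A = -5.0 * np.eye(n) + rng.standard_normal((n, n))  # a stable example matrix
B = rng.standard_normal((n, 1))                     # low rank right-hand side

I = np.eye(n)
K = np.kron(I, A) + np.kron(A, I)          # n^2 x n^2 coefficient matrix
rhs = -(B @ B.T).reshape(-1, order="F")    # -vec(B B^T), column-major
X = np.linalg.solve(K, rhs).reshape((n, n), order="F")

residual = np.linalg.norm(A @ X + X @ A.T + B @ B.T)
```

Forming K explicitly costs O(n^4) storage; a matrix-free Krylov method needs only O(n^2), and the dissertation's point is that exploiting the low rank right-hand side can bring this down to O(n), albeit with large constants.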

    Any positive residual history is possible for the Arnoldi method for Lyapunov matrix equations

    In this paper we consider the Lyapunov equation AX+XA^T+bb^T = 0, where A is a negative definite n by n matrix and b is in R^n. The Arnoldi method is an iterative algorithm which can be used to compute an approximate solution. However, the convergence can be very slow, and in this paper we show how to explicitly construct a Lyapunov equation with a given residual curve. The matrix A can be chosen as symmetric negative definite and it is possible to arbitrarily specify the elements on the diagonal of the Cholesky factor of -A. If the symmetry is dropped, then it is possible to arbitrarily specify A+A^T, while retaining the residual curve.
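As an illustration of the method the abstract analyzes, here is a minimal Galerkin/Arnoldi sketch (assuming NumPy; illustrative, not the paper's construction): project A onto the Krylov subspace K_k(A, b), solve the small projected Lyapunov equation via Kronecker vectorization, and record the residual norm at each step k, which traces out the residual history in question.

```python
import numpy as np

def arnoldi(A, b, k):
    """Standard Arnoldi: returns orthonormal basis V, Hessenberg H, and ||b||."""
    n = len(b)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):          # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H, beta

def small_lyap(Hk, c):
    """Solve Hk Y + Y Hk^T + c c^T = 0 via Kronecker vectorization."""
    m = Hk.shape[0]
    K = np.kron(np.eye(m), Hk) + np.kron(Hk, np.eye(m))
    y = np.linalg.solve(K, -np.outer(c, c).reshape(-1, order="F"))
    return y.reshape((m, m), order="F")

rng = np.random.default_rng(1)
n = 20
M = rng.standard_normal((n, n))
A = -(M @ M.T) - np.eye(n)              # symmetric negative definite
b = rng.standard_normal(n)

residuals = []
for k in range(1, 8):
    V, H, beta = arnoldi(A, b, k)
    Vk, Hk = V[:, :k], H[:k, :k]
    c = np.zeros(k)
    c[0] = beta                          # projected right-hand side
    Y = small_lyap(Hk, c)
    X = Vk @ Y @ Vk.T                    # rank-k approximate solution
    residuals.append(np.linalg.norm(A @ X + X @ A.T + np.outer(b, b)))
```

The paper's result is that this residual sequence can be made to follow any prescribed positive curve by a suitable choice of A and b, so a plot like the one produced here can be arbitrarily misleading about eventual convergence.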

    Any positive residual history is possible for the EKSM for Lyapunov matrix equations

    Let A be an n by n matrix and let B be an n by p matrix, and consider the Lyapunov matrix equation AX+XA^T+BB^T=0. If A+A^T < 0, then the extended Krylov subspace method (EKSM) can be used to compute a sequence of low rank approximations of X. In this paper we show that any positive residual history is possible for the EKSM for Lyapunov matrix equations. In addition, we show how to systematically construct linear time invariant systems for which it is impractical to approximate the action of the product of the system Gramians using the EKSM. This is a property of the underlying Lyapunov matrix equations, rather than a defect of the algorithm.